Materialized Views




There are several different ways to exploit materialized views in Calcite.

  Materialized views maintained by Calcite
  Expose materialized views to Calcite
  View-based query rewriting
  Substitution via rules transformation
  Rewriting using plan structural information
  Rewriting coverage
  Join rewriting
  Aggregate rewriting
  Aggregate rewriting (with aggregation rollup)
  Query partial rewriting
  View partial rewriting
  Union rewriting
  Union rewriting with aggregate
  Limitations
  References

Materialized views maintained by Calcite

For details, see the lattices documentation.

Expose materialized views to Calcite

Some Calcite adapters as well as projects that rely on Calcite have their own notion of materialized views.

For example, Apache Cassandra allows the user to define materialized views based on existing tables which are automatically maintained. The Cassandra adapter automatically exposes these materialized views to Calcite.

Another example is Apache Hive. When a materialized view is created in Hive, the user can specify whether the view may be used in query optimization. If the user chooses to do so, the materialized view will be registered with Calcite.

By registering materialized views in Calcite, the optimizer has the opportunity to automatically rewrite queries to use these views.

View-based query rewriting

View-based query rewriting aims to take an input query which can be answered using a preexisting view and rewrite the query to make use of the view. Currently, Calcite has two implementations of view-based query rewriting.

Substitution via rules transformation

The first approach is based on view substitution. SubstitutionVisitor and its extension MaterializedViewSubstitutionVisitor aim to substitute part of the relational algebra tree with an equivalent expression which makes use of a materialized view. The scan over the materialized view and the materialized view definition plan are registered with the planner. Afterwards, transformation rules that try to unify expressions in the plan are triggered. Expressions do not need to be equivalent to be replaced: the visitor might add a residual predicate on top of the expression if needed.

The following example is taken from the documentation of SubstitutionVisitor:

Query:

    SELECT a, c
    FROM t
    WHERE x = 5 AND b = 4

Target (materialized view definition):

    SELECT a, b, c
    FROM t
    WHERE x = 5

Result:

    SELECT a, c
    FROM mv
    WHERE b = 4

Note that the result uses the materialized view table mv and a simplified condition b = 4.
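The equivalence claimed by this example can be checked on sample data. The following is a minimal sketch, not part of Calcite: it uses Python's built-in sqlite3 module, materializes the target as an ordinary table named mv, and runs both statements. The rows inserted into t are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INT, b INT, c INT, x INT);
    -- Hypothetical sample rows, chosen so that each predicate filters something.
    INSERT INTO t VALUES
        (1, 4, 10, 5),
        (2, 4, 20, 5),
        (3, 7, 30, 5),
        (4, 4, 40, 6);
    -- Materialize the target (the view definition) as a plain table.
    CREATE TABLE mv AS SELECT a, b, c FROM t WHERE x = 5;
""")

query = "SELECT a, c FROM t WHERE x = 5 AND b = 4"
result = "SELECT a, c FROM mv WHERE b = 4"

# Sort both result sets so the comparison ignores row order.
query_rows = sorted(conn.execute(query).fetchall())
result_rows = sorted(conn.execute(result).fetchall())

print(query_rows == result_rows)
```

The predicate x = 5 is already enforced by the view, so only the residual predicate b = 4 remains in the rewritten statement.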

While this approach can accomplish a large number of rewritings, it has some limitations. Since it relies on transformation rules to establish the equivalence between expressions in the query and the materialized view, it might need to exhaustively enumerate all possible equivalent rewritings of a given expression to find a materialized view substitution. This does not scale in the presence of complex views, e.g., views with an arbitrary number of join operators.

Rewriting using plan structural information

To address this limitation, an alternative rule has been proposed that attempts to match queries to views by extracting structural information about the expression to replace.

MaterializedViewRule builds on the ideas presented in [GL01] and introduces some additional extensions. The rule can rewrite expressions containing arbitrary chains of Join, Filter, and Project operators. Additionally, the rule can rewrite expressions rooted at an Aggregate operator, rolling aggregations up if necessary. It can also produce rewritings using Union operators if the query can be partially answered from a view.

To produce a larger number of rewritings, the rule relies on the information exposed as constraints defined over the database tables, e.g., foreign keys, primary keys, unique keys, or NOT NULL constraints.
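To illustrate why such constraints are useful, the sketch below shows that a join along a NOT NULL foreign key that references a primary key returns exactly one row per referencing row. It is an illustration in Python's built-in sqlite3 module, not Calcite code; the two tables are a reduced, hypothetical version of an employee/department schema with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE depts (
        deptno INT NOT NULL PRIMARY KEY,
        deptname VARCHAR(20)
    );
    CREATE TABLE emps (
        empid INT NOT NULL PRIMARY KEY,
        deptno INT NOT NULL REFERENCES depts (deptno),
        salary DECIMAL(18, 2)
    );
    -- Hypothetical sample rows.
    INSERT INTO depts VALUES (10, 'Sales'), (20, 'Engineering');
    INSERT INTO emps VALUES (1, 10, 100), (2, 10, 200), (3, 20, 300);
""")

emps_rows = conn.execute("SELECT COUNT(*) FROM emps").fetchone()[0]
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM emps JOIN depts USING (deptno)"
).fetchone()[0]

# Every employee matches exactly one department, so the join neither
# drops nor duplicates rows.
print(emps_rows == joined_rows)

# A row that would break the guarantee is rejected by the constraint.
try:
    conn.execute("INSERT INTO emps VALUES (4, 99, 400)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

Because the constraints guarantee that the join only appends columns, an optimizer that knows about them can treat a view containing the join as usable for a query that omits it.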

Rewriting coverage

Let us illustrate with some examples the coverage of the view rewriting algorithm implemented in MaterializedViewRule. The examples are based on the following database schema.

    CREATE TABLE depts(
      deptno INT NOT NULL,
      deptname VARCHAR(20),
      PRIMARY KEY (deptno)
    );

    CREATE TABLE locations(
      locationid INT NOT NULL,
      state CHAR(2),
      PRIMARY KEY (locationid)
    );

    CREATE TABLE emps(
      empid INT NOT NULL,
      deptno INT NOT NULL,
      locationid INT NOT NULL,
      empname VARCHAR(20) NOT NULL,
      salary DECIMAL (18, 2),
      PRIMARY KEY (empid),
      FOREIGN KEY (deptno) REFERENCES depts(deptno),
      FOREIGN KEY (locationid) REFERENCES locations(locationid)
    );

Join rewriting

The rewriting can handle different join orders in the query and the view definition. In addition, the rule tries to detect when a compensation predicate could be used to produce a rewriting using a view.

Query:

    SELECT empid
    FROM depts
    JOIN (
      SELECT empid, deptno
      FROM emps
      WHERE empid = 1) AS subq
    ON depts.deptno = subq.deptno

Materialized view definition:

    SELECT empid
    FROM emps
    JOIN depts USING (deptno)

Rewriting:

    SELECT empid
    FROM mv
    WHERE empid = 1

Aggregate rewriting

Query:

    SELECT deptno
    FROM emps
    WHERE deptno > 10
    GROUP BY deptno

Materialized view definition:

    SELECT empid, deptno
    FROM emps
    WHERE deptno > 5
    GROUP BY empid, deptno

Rewriting:

    SELECT deptno
    FROM mv
    WHERE deptno > 10
    GROUP BY deptno

Aggregate rewriting (with aggregation rollup)

Query:

    SELECT deptno, COUNT(*) AS c, SUM(salary) AS s
    FROM emps
    GROUP BY deptno

Materialized view definition:

    SELECT empid, deptno, COUNT(*) AS c, SUM(salary) AS s
    FROM emps
    GROUP BY empid, deptno

Rewriting:

    SELECT deptno, SUM(c), SUM(s)
    FROM mv
    GROUP BY deptno

Query partial rewriting

Through the declared constraints, the rule can detect joins that only append columns without altering tuple multiplicity, and it can therefore produce correct rewritings.

Query:

    SELECT deptno, COUNT(*)
    FROM emps
    GROUP BY deptno

Materialized view definition:

    SELECT empid, depts.deptno, COUNT(*) AS c, SUM(salary) AS s
    FROM emps
    JOIN depts USING (deptno)
    GROUP BY empid, depts.deptno

Rewriting:

    SELECT deptno, SUM(c)
    FROM mv
    GROUP BY deptno

View partial rewriting

Query:

    SELECT deptname, state, SUM(salary) AS s
    FROM emps
    JOIN depts ON emps.deptno = depts.deptno
    JOIN locations ON emps.locationid = locations.locationid
    GROUP BY deptname, state

Materialized view definition:

    SELECT empid, deptno, state, SUM(salary) AS s
    FROM emps
    JOIN locations ON emps.locationid = locations.locationid
    GROUP BY empid, deptno, state

Rewriting:

    SELECT deptname, state, SUM(s)
    FROM mv
    JOIN depts ON mv.deptno = depts.deptno
    GROUP BY deptname, state

Union rewriting

Query:

    SELECT empid, deptname
    FROM emps
    JOIN depts ON emps.deptno = depts.deptno
    WHERE salary > 10000

Materialized view definition:

    SELECT empid, deptname
    FROM emps
    JOIN depts ON emps.deptno = depts.deptno
    WHERE salary > 12000

Rewriting:

    SELECT empid, deptname
    FROM mv
    UNION ALL
    SELECT empid, deptname
    FROM emps
    JOIN depts ON emps.deptno = depts.deptno
    WHERE salary > 10000 AND salary <= 12000

Union rewriting with aggregate

Query:

    SELECT empid, deptname, SUM(salary) AS s
    FROM emps
    JOIN depts ON emps.deptno = depts.deptno
    WHERE salary > 10000
    GROUP BY empid, deptname

Materialized view definition:

    SELECT empid, deptname, SUM(salary) AS s
    FROM emps
    JOIN depts ON emps.deptno = depts.deptno
    WHERE salary > 12000
    GROUP BY empid, deptname

Rewriting:

    SELECT empid, deptname, SUM(s)
    FROM (
      SELECT empid, deptname, s
      FROM mv
      UNION ALL
      SELECT empid, deptname, SUM(salary) AS s
      FROM emps
      JOIN depts ON emps.deptno = depts.deptno
      WHERE salary > 10000 AND salary <= 12000
      GROUP BY empid, deptname) AS subq
    GROUP BY empid, deptname
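Two of the rewriting patterns in this section, aggregation rollup and union rewriting, can be sanity-checked on sample data. The sketch below is an illustration in Python's built-in sqlite3 module, not Calcite code. The table names mv_rollup and mv_union, the helper same_rows, and the inserted rows are all hypothetical. For the union case, the branch over the base tables must cover exactly the rows that the view misses, that is, salary > 10000 AND salary <= 12000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE depts (deptno INT NOT NULL PRIMARY KEY, deptname VARCHAR(20));
    CREATE TABLE emps (
        empid INT NOT NULL PRIMARY KEY,
        deptno INT NOT NULL REFERENCES depts (deptno),
        salary DECIMAL(18, 2)
    );
    -- Hypothetical sample rows; 12000 sits exactly on the view's boundary.
    INSERT INTO depts VALUES (10, 'Sales'), (20, 'Engineering');
    INSERT INTO emps VALUES
        (1, 10,  9000),
        (2, 10, 11000),
        (3, 20, 12000),
        (4, 20, 15000);
    -- Views materialized as plain tables (names are assumptions).
    CREATE TABLE mv_rollup AS
        SELECT empid, deptno, COUNT(*) AS c, SUM(salary) AS s
        FROM emps GROUP BY empid, deptno;
    CREATE TABLE mv_union AS
        SELECT empid, deptname
        FROM emps JOIN depts ON emps.deptno = depts.deptno
        WHERE salary > 12000;
""")


def same_rows(q1, q2):
    """Return True if both statements yield the same multiset of rows."""
    return sorted(conn.execute(q1).fetchall()) == sorted(conn.execute(q2).fetchall())


# Aggregation rollup: COUNT(*) becomes SUM(c), SUM(salary) becomes SUM(s).
rollup_ok = same_rows(
    "SELECT deptno, COUNT(*), SUM(salary) FROM emps GROUP BY deptno",
    "SELECT deptno, SUM(c), SUM(s) FROM mv_rollup GROUP BY deptno",
)

# Union rewriting: the view answers salary > 12000, the base tables the rest.
union_ok = same_rows(
    """SELECT empid, deptname
       FROM emps JOIN depts ON emps.deptno = depts.deptno
       WHERE salary > 10000""",
    """SELECT empid, deptname FROM mv_union
       UNION ALL
       SELECT empid, deptname
       FROM emps JOIN depts ON emps.deptno = depts.deptno
       WHERE salary > 10000 AND salary <= 12000""",
)

print(rollup_ok, union_ok)
```

A check like this only shows agreement on one data set; it does not prove equivalence in general.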

